b'U.S. DEPARTMENT OF COMMERCE\nOffice of Inspector General\n\nPUBLIC RELEASE\n\nBUREAU OF THE CENSUS\nA Better Strategy Is Needed for\nManaging the Nation’s Master Address File\nInspection Report No. OSE-12065/September 2000\n\nOffice of Systems Evaluation\n\nU.S. Department of Commerce    Report OSE-12065\nOffice of Inspector General    September 2000\n\nTable of Contents\n\nEXECUTIVE SUMMARY . . . . . i\n\nINTRODUCTION . . . . . 1\n\nPURPOSE AND SCOPE . . . . . 2\n\nBACKGROUND . . . . . 4\n\nOBSERVATIONS AND CONCLUSIONS . . . . . 13\n\nI.   Sufficient Time Not Available to Ensure High Quality Address Data . . . . . 13\n    A.  High Levels of Data Quality Not Achieved for Initial DMAF . . . . . 13\n    B.  LUCA 98 Schedule Slips and Bureau Policy Reduced the Effectiveness of\n        Block Canvassing . . . . . 15\n    C.  Insufficient Time Also Contributed to Erroneous Addresses\n        on the Initial DMAF . . . . . 
. . . . . 17\n\nII.  The Bureau Has Taken Steps to Improve Address Data Quality,\n     but More Should Be Done . . . . . 18\n    A.  The Bureau Identified Some Erroneous Addresses and\n        Flagged Them As Ineligible for Nonresponse Followup . . . . . 18\n    B.  The Bureau’s Policy for Determining Address Eligibility\n        During the Census Is Not Well-Defined . . . . . 19\n    C.  The Bureau Should Use the MAF as a Management Tool . . . . . 20\n\nIII. Improved Software Engineering Standards Could Improve Data Quality . . . . . 21\n    A.  Consistency Between MAF and TIGER Not Easily Maintained . . . . . 21\n    B.  Software Engineering Standards Will Support Modernization . . . . . 22\n\nIV.  Success in Meeting Housing Unit Accuracy and Completeness Goals\n     Should Be Reported . . . . . 24\n    A.  The Bureau Has Housing Unit Coverage Goals and Evaluation Plans . . . . . 24\n    B.  The Annual Performance Plan and Report Should Be Used to Report\n        Housing Coverage . . . . . 25\n\nRECOMMENDATIONS . . . . . 28\n\nAPPENDICES:\n    A.  Acronyms Used in This Report\n    B.  Bureau Response\n\x0cU.S. 
Department of Commerce                                                  Report OSE-12065\nOffice of Inspector General                                                    September 2000\n\n                                  EXECUTIVE SUMMARY\n\nThe 2000 Decennial Census enumerates the U.S. population and housing as of April 1, 2000.\nThis evaluation focuses on the Master Address File (MAF), which supplies addresses used to\nsupport operations responsible for mailing out questionnaires, enumerating nonresponding\nhouseholds, and controlling the collection and tabulation of population data. The quality of\nMAF addresses directly affects the accuracy, completeness, and cost of the decennial. In the\n1990 decennial, one-third of persons missed were not counted because address data for their\nhousing units was missing from the address file. Also, the General Accounting Office reported\nthat in the 1990 decennial the bureau spent $317 million on operations to identify 4.8 million\nnonexistent housing units and 8.6 million vacant housing units and to remove the former from\nthe address file.\n\nTo address questions raised regarding how well programs to create the MAF for Census 2000\nwere working and what its quality would be at the start of the decennial, the bureau decided in\n1997 to reengineer the MAF-building strategy. The objectives of our evaluation were to (1)\ndetermine if steps taken to improve the MAF before delivery of the address file worked as\nplanned, (2) assess steps taken to identify and correct MAF data quality problems as the\ndecennial progressed, (3) determine whether the software development approach ensured high\nquality data, and (4) evaluate whether the data quality standard for the MAF provides a\nmeaningful benchmark for decision-makers. 
This evaluation focuses on city-style addresses,\nwhich comprise over 80 percent of the nation’s residential addresses.\n\nThe bureau estimated through demographic methods that as of July 1998, there were 112.5\nmillion housing units nationwide. The addresses contained in the MAF represent the nation’s\nhousing units and, together with each address’s geographic location found in the Topologically\nIntegrated Geographic Encoding and Referencing (TIGER®) mapping system, provide an\nessential tool for collecting responses and counting people where they are located. Unless\nfound to be nonexistent or duplicate, an address equates to a housing unit in the decennial\ncensus. For the decennial, the bureau created the decennial MAF (DMAF), which contains all\nMAF addresses meeting decennial eligibility requirements such as being a residential address\nthat links, or geocodes, to a unique geographic location in TIGER.\n\nTo build and maintain the address file, the bureau has implemented operations designed to\ndecrease undercoverage (missed housing units) and overcoverage (duplicate, nonexistent, and\nother erroneous addresses). The bureau’s reengineered strategy for city-style addresses allowed\nmore time for local governments to review MAF address lists and submit corrections. A key\ncomponent of the reengineering was 100 percent block canvassing. In this operation, designed\nto verify address data provided by local governments and the Postal Service, bureau employees\nwere to canvass 100 percent of the blocks in an assigned area to confirm existing MAF addresses\nand add new ones.\n\n                                                i\n\x0cU.S. 
Department of Commerce                                                   Report OSE-12065\nOffice of Inspector General                                                     September 2000\n\n\nThe bureau did not have sufficient time available to ensure high quality address data.\nReengineering of the MAF had to be completed in the two years before the addresses were\nneeded for labeling questionnaires and did not leave enough time to accommodate difficulties in\nreceiving addresses from local governments in time to be verified by block canvassing. We\nfound that incomplete address lists were used in block canvassing, reducing its effectiveness in\nimproving address quality. Moreover, the bureau did not have time to resolve questions about\nthe accuracy of over 5 million addresses and decided to include them and an unknown number of\nduplicate addresses in the decennial until more information became available. (See page 13.)\n\nThe bureau has taken steps to improve address quality and has potentially identified 10.2\nmillion nonexistent or duplicate addresses, approximately 4.3 million before nonresponse\nfollowup and an additional 5.9 million during nonresponse followup, while identifying and\nretaining 9.7 million vacant housing units. The official number of vacant and nonexistent\nhousing units will be determined after a subsequent field operation, which will provide\ninformation to confirm vacant and nonexistent addresses, convert them to occupied, or remove\nthem from the census. However, the policy for determining the addresses eligible for census\noperations has not been well-defined, and, at the time of our field work, the decision about\nwhich addresses to include in the final decennial results had not been made. 
Rather than being\npresented explicitly in the bureau’s decision memorandum series that documents decennial\npolicy, address eligibility rules are implicit in software specifications, which often are not\nfinalized until data processing for the operation is imminent or underway. Finally, the bureau\ncould increase use of information already on the MAF to identify missing housing units and\npotential errors. (See page 18.)\n\nMAF addresses need to be linked to unique geographic locations (geocoded) to ensure that the\nbureau can count persons in their correct locations and that users of the data can accurately\nredraw congressional, state, and municipal legislative district lines. Because MAF and TIGER\nwere developed separately and are not integrated, consistency between them cannot be easily\nmaintained. TIGER data can be modified without ensuring that both databases have accurate and\nconsistent information, causing some decennial addresses to no longer link to TIGER. In\naddition, some decennial addresses received a geocode from block canvassing but do not link to\nTIGER. As of April 2000, 4.5 million decennial addresses did not have a current link to TIGER.\nAlthough these housing units will still be in the census, they risk being inaccurately located. The\nbureau needs to take steps to ensure that decennial addresses are geocoded accurately. In\naddition, in developing MAF and TIGER software, the bureau does not follow rigorous software\nengineering practices and therefore cannot ensure that all results are accurate. Software\nengineering standards should be used in the planned modernization of the systems that support\nthe MAF geocoding process and overall data quality. (See page 21.)\n\nThe bureau should report success in meeting housing unit accuracy and completeness goals. The\nbureau has created a housing unit coverage performance standard and methods to evaluate if it is\nmet. 
The goal for Census 2000 is to miss not more than 2.5 percent of existing housing units\nand include not more than 1.5 percent in error, for a net undercoverage rate of 1 percent. The\nAccuracy and Coverage Evaluation will measure housing unit coverage. However, the bureau\nhas not clearly stated how it will report the standard and its success at meeting it. The Annual\nPerformance Plan and Program Performance Report mandated by the Government Performance\nand Results Act of 1993, in which agencies report performance goals, measures, and\naccomplishments to the President and Congress, are appropriate vehicles for reporting on this\nimportant data quality standard, including its separate overcoverage and undercoverage\ncomponents. (See page 24.)\n\nFor Census 2000, we recommend that the bureau (1) issue a decision memorandum that explains\nthe eligibility policy for the addresses to be included in the final decennial count, (2) ensure that\nany further TIGER changes made during the decennial are verified with the MAF so that no\nadditional decennial addresses lose their link to TIGER, and (3) report evaluation results\nmeasuring housing unit coverage, including its separate overcoverage and undercoverage\ncomponents. We make additional recommendations for future censuses and surveys designed to\nimprove the accuracy and completeness of the MAF and promote a rigorous software\nengineering approach to the modernization of MAF and TIGER. Finally, we recommend that\nthe bureau provide housing unit coverage standards and report on its progress toward meeting\nthem in future Government Performance and Results Act reporting. 
(See page 28.)\n\nIn its response to our draft report, the bureau stated that, with one exception, it concurs with or\nhad already acted upon our 11 recommendations. The exception regards a portion of\nRecommendation 10, implementing an annual national or small area MAF coverage\nmeasurement, which the bureau stated would not be cost effective or practical to accomplish\nannually. We believe other methods, such as comparing tallies of MAF addresses to\ndemographic estimates as employed by Population Division, may serve as an alternative to more\ncostly measurements. Although the bureau concurred with Recommendation 5, to issue a\nmemorandum that explains the address eligibility policy for the final delivery of addresses to be\nincluded in Census 2000, we believe that the bureau needs to augment this “general policy”\nwith specific criteria for determining which addresses are eligible for the decennial. Also, in\nconcurring with Recommendation 6, to use information in the MAF as a management tool to\nincrease the completeness and accuracy of the address file, the bureau described many\ninnovations but omitted additional techniques we believe should be used, such as applying\naddress history data to locate geographic areas where addresses are likely to be missing and to\npinpoint addresses likely to be geocoded in error. Finally, the bureau provided comments on\nseveral aspects of our observations. Based on these comments, we have clarified the appropriate\nareas of the report. The bureau’s response is included in its entirety as Appendix B to this\nreport.\n\n\n\n\n                                                 iii\n\x0cU.S. 
Department of Commerce                                                               Report OSE-12065\nOffice of Inspector General                                                                 September 2000\n\n                                                INTRODUCTION\n\nThe 2000 Decennial Census enumerates the U.S. population and housing as of April 1, 2000.\nReflecting a long tradition, Census 2000 will be the 22nd decennial enumeration in an unbroken\nchain that our nation has undertaken. The decennial provides information that describes the\nnation’s population within small geographic areas and is used to apportion the U.S. House of\nRepresentatives. Decennial data is used to redraw congressional, state, and municipal legislative\ndistrict lines and provide the basis for determining the distribution of $200 billion of federal\nfunds. The decennial is the only data-gathering operation in the United States that is mandated\nby the Constitution.\n\nAn accurate and credible decennial depends on the Census Bureau’s implementing a complex set\nof data collection operations and data processing systems that must work together to meet data\nquality standards. This evaluation focuses on the quality of addresses supplied to the decennial\nfrom the Master Address File (MAF). The MAF supplies the addresses used to support the\noperations responsible for mailing and hand-delivering questionnaires, enumerating\nnonresponding households, and controlling the collection and tabulation of Census 2000 data.\nThe MAF is often referred to as the heart of Census 2000 because it is intended to supply a\ncomplete list of living quarters used to identify all households that will be receiving a\nquestionnaire for mail return. Decennial addresses provided by the MAF become the contents of\nthe decennial master address file (DMAF). Responses received from the public are processed\nand merged with the DMAF to create the file that is the basis of decennial results. 
Decennial\naddresses must be precisely located based on geographic location information found in the\nbureau’s Topologically Integrated Geographic Encoding and Referencing (TIGER®) system.1\nThe bureau must create a complete list of addresses that correctly identifies the block2 where the\naddress resides and excludes nonresidential, nonexistent, and duplicate addresses.\n\n        1 The remainder of this report refers to this system as TIGER.\n        2 A block is the smallest entity for which the bureau collects decennial information. A block is bounded\nby physical features and county boundaries.\n        3 Report to Congress – The Plan for Census 2000, Bureau of Census, August 1997. Stated in percentages:\n“Based on the 1990 PES [Post-Enumeration Survey] results, 69.5 percent of the coverage error came from\nenumerated housing units and the remaining 30.5 percent came from housing units that were not enumerated at\nall.”\n\nThe quality of addresses chosen for the decennial is the cornerstone of the decennial’s accuracy\nand completeness and an important determinant of its cost effectiveness. The bureau found that\nin the 1990 decennial, nearly one in every three persons missed went uncounted\nbecause their housing unit was missing from the address file.3 Also, address data in error\nincreases costs of operations for following up on people who did not respond to the\nquestionnaire delivered to them. 
Also, the General Accounting Office (GAO) reported that in the\n1990 decennial the bureau spent $317 million on operations to identify 4.8 million\nnonexistent housing units and 8.6 million vacant housing units and to remove the former from\nthe address file. For this decennial, as of June 14, 2000, the bureau has identified 10.2 million\nnonexistent households, while retaining 9.7 million vacant housing units. As in the 1990\ndecennial, the bureau has implemented a second followup operation to verify nonexistent and\nvacant units. Costs to verify nonexistent households are not yet known.\n\n                                       PURPOSE AND SCOPE\n\nThe overall objectives of this evaluation were to determine the extent to which the bureau had\ncompiled complete and correct housing unit data by the time the data was needed for addressing\ncensus questionnaires and to determine the extent to which further corrective actions were\nneeded during the decennial. Our specific evaluation objectives were to (1) determine if steps\ntaken to improve the MAF before delivery of the address file worked as planned, (2) assess how\nwell the bureau identified and addressed MAF data quality problems as the decennial progressed,\n(3) determine whether the software development approach ensured high quality data, and (4)\nevaluate whether the data quality standard for the MAF provides a meaningful benchmark for\ndecision-makers.\n\nWe chose these objectives because during the 1998 Dress Rehearsal, housing unit data quality\ndid not meet all goals, and the bureau has undertaken several large operations to improve MAF\nquality, resulting in the need to assess the success of these operations and placing new demands\non software processing. We limited our evaluation to bureau-identified households in urban\nareas that receive mail at their address through the U.S. 
Postal Service.\n\nTo gain a high-level understanding of decennial plans and strategies, we reviewed the Report to\nCongress – The Plan for Census 2000, originally issued July 1997 and revised August 1997. We\nalso reviewed the Census 2000 Operational Plan Using Traditional Census-Taking Methods,\ndated January 1999 and subsequent updates. In addition, we reviewed concerns about the time\nneeded by the bureau to obtain and verify address data provided by local governments raised in\nour prior report on the Local Update of Census Addresses (LUCA) Program.4\n\n        4 Additional Steps Needed to Improve Local Update of Census Addresses for the 2000 Decennial Census,\nIPE-10756, September 1998.\n\nTo accomplish our first objective, we reviewed the Census 2000 Address List Reengineering,\nCase for Change, dated September 24, 1997, to identify the plan, schedule, and goals for MAF\naccuracy and completeness. We also reviewed dress rehearsal MAF evaluation reports and\nsupporting documentation to obtain an understanding of issues pertaining to the quality of the\nMAF at the time of the dress rehearsal. To understand operations to improve the MAF, we\nreviewed plans for key improvement operations and bureau documentation about these\noperations. We also reviewed pertinent Commerce Inspector General and GAO reports.\n\nTo determine whether steps taken to improve the MAF before delivery of the address file\nworked as planned, we reviewed the study conducted by the bureau’s Population Division,\nResults from the County Level Demographic Benchmark Analysis of the Decennial Master\nAddress File. 
The purpose of this study was to examine the accuracy of the housing unit counts\nbased on the DMAF addresses used to deliver questionnaires when compared to the division’s\nindependent estimates of the numbers of housing units in each county. To learn more about\noperations used to build the MAF and TIGER, we interviewed decennial officials, including the\nAssistant Division Chief, Geographic Operations, and her staff and the Head of the Geographic\nPlanning and Budget Team. We also interviewed quality assurance and evaluation officials in\nthe Decennial Statistical Studies Division (DSSD), Geography Division, and Planning,\nResearch, and Evaluation Division (PRED). Many of those whom we interviewed were also\nmembers of the Address List Development Operations Planning Group, which is responsible for\nresolving decennial address issues as part of its overall charter to design all address list\ndevelopment activities and communicate the operational requirements.\n\nTo accomplish our second objective, we evaluated the decision-making process used to identify\naddresses for inclusion in the decennial. We began our analysis of the bureau’s address data\nby setting up several test cases based on potential errors we identified on TIGER maps and\non the corresponding addresses. For a 3.4-by-3.7-mile area of Prince George’s County,\nMaryland, we compared the TIGER map to a commercial map, visited areas with discrepancies\nbetween the two, and noted addresses on streets in question. We then queried the MAF to see if\nthese addresses were in the MAF and marked for inclusion in the decennial. We also reviewed a\nsmall sample of addresses that were included in the decennial even though they were listed as\nnonresidential. 
After physically verifying errors in data selected for the decennial, we traced\nthese errors to criteria in the specifications used for selecting data for inclusion in operations to\nprepare for and start the decennial. This information makes it possible to locate potential errors\nand determine whether they have broader significance.5 We then reviewed the bureau’s plans\nfor resolving these addresses and spoke with officials in the Geography Division and DSSD\nresponsible for this work.\n\n        5 This field work was confined to a small, local geographic area and was intended only to identify types\nof errors that can occur, not to provide any statistical quantification of the problem.\n\nTo accomplish our third objective, we interviewed the Assistant Division Chief, Geographic\nApplication Systems, and members of his staff and the Assistant Division Chief, Geoprocessing\nSystems, and members of his staff. We also interviewed the Branch Chief of the Geographic\nProducts Quality Assurance Team and other quality assurance staff. We reviewed software\nspecifications and quality assurance practices followed in implementing the software that\nprocesses and produces address data for decennial operations, including operations to improve\nthe MAF. 
We interviewed officials responsible for improving and maintaining the housing unit\naddress data and for providing extracts of this data to support the decennial and other bureau\nprograms, such as the American Community Survey.6 In the Decennial Systems and Contracts\nManagement Office (DSCMO), we interviewed the program manager responsible for processing\nthe data provided from the MAF for decennial operations and the data obtained from those\noperations.\n\nTo accomplish our fourth objective, we learned about the bureau’s efforts to gather, measure,\nand improve its housing unit data by reviewing memorandums and reports from the 1990\ndecennial and tests, operations, and studies designed to provide the bureau with information\nneeded to prepare for Census 2000. This documentation included 1998 Dress Rehearsal\nevaluations of the MAF and reports from the DSSD MAF Quality Improvement Program. We\nalso interviewed bureau officials from the Decennial Management Division, DSSD, and PRED\nregarding the status of the data quality standard for the MAF and the policy for reporting on\nprogress toward achieving it. We also reviewed the Government Performance and Results Act\nof 1993 to determine the requirements for reporting performance.\n\nThis inspection has been conducted in accordance with the Inspector General Act of 1978, as\namended, and the Quality Standards for Inspections, March 1993, issued by the President’s\nCouncil on Integrity and Efficiency. Our field work was conducted from August 1999 through\nJune 2000 at the bureau’s headquarters in Suitland and Upper Marlboro, Maryland, and selected\nareas in Prince George’s County.\n\n\n                                              BACKGROUND\n\nIn conducting the decennial, the bureau attempts to deliver a questionnaire to every household in\nthe country. To accomplish this task, the bureau needs to know the address for each housing\nunit. 
The MAF and TIGER are two databases that together contain the nation’s addresses and\ntheir geographic locations. The MAF database contains both residential and nonresidential (for\nexample, business and religious organizations) addresses. To be valid for the decennial, an\naddress must be for a housing unit where people reside and, therefore, is typically residential.\n\n        6 The American Community Survey is designed to provide the data communities need every year instead\nof every 10 years. It is an ongoing survey that the bureau plans will replace the long form in the 2010 Census. See\nthe survey web site at http://www.census.gov/acs/www/ for more information.\n\nUsing independent demographic analysis, the bureau estimated that, as of July 1998, there were\n112.5 million residential housing units. A year later, approximately 120 million\nresidential addresses in the MAF were identified for use in the decennial. The number of MAF\nresidential addresses was higher than the 1998 estimates, in part, because the MAF contained\nduplicate and nonexistent addresses. Residential addresses are divided into two basic address\nstyles, city-style and non-city-style. 
A city-style address consists of a house number and street\nname (101 Main Street, for example), with an optional apartment number and direction such as\n“North.” A non-city-style address may have a delivery route number or box number.\nApproximately 80 percent of residential addresses are city-style and 20 percent are\nnon-city-style.\n\nIn areas where city-style addresses predominate, the TIGER database contains streets with\nassociated address ranges; in areas where non-city-style addresses predominate, TIGER contains\nroads and map spots to indicate the location of housing units. In TIGER, the entire country is\ndivided into discrete geographic areas called blocks. Each of the country’s 3,142 counties has\nits own set of uniquely numbered blocks. To be valid for the decennial, a MAF address must be\nlinked to exactly one block in TIGER. The process of linking or assigning addresses to blocks\nis called geocoding.\n\nA third database, the Decennial Master Address File, incorporates address data from the MAF\nwith control data for tracking questionnaire responses submitted by the public. DMAF data\nlinks the public’s responses to each housing unit to calculate housing unit participation in the\ndecennial. If an address is not in the DMAF at the start of the decennial, it must be added during\ndecennial operations to be included in the decennial results. These new addresses will also have\nto be added to the MAF and geocoded. Similarly, decennial operations identify addresses that\nshould not be in the decennial either because they are duplicates, because no such housing unit\nexists, or for other reasons. Such addresses are not physically deleted from the MAF or DMAF\nbut are flagged so that they are not included in further decennial operations or in the decennial\nresults. 
The DMAF is also used to provide address lists for operations such as Nonresponse\nFollowup (NRFU), in which enumerators attempt to find nonrespondents, and Coverage\nImprovement Followup (CIFU), in which enumerators, among other things, attempt to find and\nenumerate households not included in NRFU.\n\nMAF, TIGER, and DMAF are separate databases that contain address data. They are each\nupdated as new address data is provided through decennial operations. Because data from all\nthree is used to implement and manage decennial operations, consistency between them is\nessential to producing accurate results for Census 2000.\n\nA critical milestone for use of housing unit data in the decennial occurred in July and August\n1999, when the initial DMAF was created with address data extracted from the MAF. The July\n1999 and supplemental August 1999 MAF extract for the DMAF provided the initial universe of\nhousing units for the decennial. This address data was used to label questionnaires for eventual\nmail and hand delivery to the nation’s housing units in time for Census Day, April 1, 2000.\n\nStrategies to Build the MAF\n\nIn 1992, the Congress mandated a study of the fundamental requirements for the nation’s\ndecennial census by the National Academy of Sciences’ National Research Council. The\ncouncil recommended that the bureau develop cooperative arrangements with local/tribal\ngovernments to improve its address data. Section 9 of Title 13 authorizes the bureau to protect\nthe confidentiality of the persons from whom it collects data, and in general prohibits the\nsharing of any individual’s data collected by the bureau. 
To allow the bureau to implement the\nCouncil\xe2\x80\x99s recommendation, the Congress passed the Census Address List Improvement Act of\n1994, which amended Title 13 to require the bureau to solicit address list feedback from\nlocal/tribal governments. This act also mandated that the bureau use the U.S. Postal Service\xe2\x80\x99s\nDelivery Sequence File, a nationwide list of individual mail delivery points.\n\n       The Initial Plan\n\nIn September 1997, the bureau began to build the MAF for the entire nation by combining the\n1990 decennial address list with the Postal Service file for counties with city-style addresses.\nAfter merging the two files, all addresses were submitted to automated processing that geocoded\nthem to TIGER. Some addresses could not be geocoded by automated means because TIGER\ndata did not include the house number within the address range for that street or the street was\nmissing. These addresses were sent to an operation called the MAF Geocoding Office\nResolution, which researched the correct information for the locale and updated TIGER so the\naddress could be geocoded. To incorporate address changes or additions reflected by the Postal\nService data, the bureau has updated the MAF periodically with later versions of the Postal\nService file. The geocoding process occurs as part of each such update.\n\nTo obtain local and tribal government input, the bureau created the Program for Address List\nSupplementation. Through this program, the bureau invited local governments to submit lists of\ncity-style addresses to obtain any that still might be missing or listed incorrectly. The bureau\nplanned field operations to verify discrepancies between existing MAF data and that provided\nby the program. The bureau also planned targeted canvassing to identify housing units whose\naddresses were missing from the MAF. 
Finally, a Local Update of Census Addresses (LUCA)\nprogram to be started about a year before creating the initial DMAF would provide an\nopportunity for participating local/tribal governments to review the address lists and provide\nupdates and corrections.\n\n       The Reengineered Plan\n\nEven before the merging operation that created the initial MAF, the bureau realized that its\ncurrent plan would result in an address list with rates of overcoverage (erroneous addresses) and\nundercoverage (missed housing units) that would burden decennial operations. The bureau\nformed a team with members from various divisions to develop an approach to achieving the\nmost accurate list possible, using local and tribal government involvement. Time to implement\nthe plan was short\xe2\x80\x94within a little over two years, the bureau had to supply addresses to the\nprinters to label questionnaires. The bureau documented its new plan in September 1997.7\n\nFor city-style addresses, the new approach allowed more time for local governments to review\nMAF address lists and submit additions, deletions, or other corrections in a program now called\nLUCA 98. This new program would assist governments that do not maintain address data as\nmailing lists. To verify address data provided by local governments and account for\ndeficiencies in the Postal Service file, the bureau changed its plan from targeted canvassing to\n100 percent block canvassing by bureau employees, an operation considered critical to\nachieving a database with uniformly high quality. 
For the block canvassing operation, listers\nwere to canvass 100 percent of their assigned area and conduct brief interviews at approximately\nevery third housing unit, every multi-unit structure, and all added housing units.8 Block\ncanvassing was called a \xe2\x80\x9cdependent listing\xe2\x80\x9d because listers used lists of addresses that the\nbureau generated from the MAF for their specific assignment area. The listers were to compare\neach address with those on the list, mark correct ones as verified, and record all additions,\ndeletions, and corrections. In addition, the listers were to update TIGER maps. The bureau\nbelieved that block canvassing was the only method that could identify and correct all types of\naccuracy problems. A separate reconciliation operation was planned for LUCA 98 results that\ndiffered from bureau results.\n\nTo ensure the completeness of city-style addresses, the bureau also planned a Postal Service\nvalidation, in which postal carriers would place pre-addressed cards in their mail\nsorting cases to identify either undeliverable or missing addresses. The bureau believed that the\nPostal Service validation would update the MAF with new residential construction occurring in\n1999 and early 2000 that was missed by other operations. The check was to be conducted as\nclose to Census Day as possible.\n\n\n\n\n       7\n           Census 2000 Address List Reengineering, Case for Change, September 24, 1997.\n       8\n           Census 2000 Block Canvassing Program Master Plan, Bureau of the Census, March 1999.\n\n        Changes to the Reengineered Plan\n\nThe bureau subsequently modified some aspects of the reengineered plan. 
It replaced the Postal\nService validation with a letter carrier review and correction of addresses in the summer of 1999\nand again in January 2000.9 The bureau made this decision after months of evaluation and\ndiscussions with the Postal Service. The new plan allowed earlier incorporation of new\naddresses and reflected actions taken by the Postal Service to increase the currency and accuracy\nof the Postal Service file. The bureau also added a program to allow local/tribal governments to\nidentify newly constructed housing units starting in January 2000. This program responded to\nconcerns raised by local/tribal governments that housing units constructed between January 2000\nand Census Day would not be included in the decennial.\n\nBureau Tests and Evaluations Support the Decision to Revise the MAF-building Approach\n\nBureau evaluations and tests conducted before and after the decision to alter the strategy\nsupported the need to improve MAF accuracy and completeness. The reengineering plan cited\ndata quality goals for the MAF that called for overcoverage of 1.5 percent and undercoverage of\n2.5 percent, for a net undercoverage of 1 percent.10 However, according to this plan, the bureau\xe2\x80\x99s\nexperience with using the combined Postal Service file and 1990 address list during the 1995\nCensus Test revealed that the bureau was not meeting its goals. According to the reengineering\nplan, the 1995 test of two cities showed undercoverage ranging from 3.9 to 6.7 percent and\novercoverage ranging from 6.0 to 9.0 percent. 
Further, the data indicated that coverage problems\nwere worse in multi-unit structures, where undercoverage ranged from 4.0 to 7.9 percent and\novercoverage ranged from 5.7 to 9.4 percent, depending on the size of the structure.\n\nTwo other evaluations conducted by PRED and DSSD during 1997 and 1998 also found\ncoverage problems.11 The 1997 MAF Quality Improvement Program Pilot Study, which\n\n\n        9\n            In addition, the bureau used updated Postal Service data to update the MAF in February 2000.\n        10\n            The reengineering plan cited no more than 1 percent duplicates as part of the overcoverage measurement\ngoal. We estimated the total overcoverage measurement goal, including duplicates and other nonexistent housing\nunits, as 1.5 percent based on the bureau\xe2\x80\x99s goal for missing housing units of 2.5 percent and net housing unit\nundercoverage measurement of 1 percent.\n        11\n          1997 Master Address File Quality Improvement Program Pilot Study, Bureau of the Census, PRED,\nApril 1999 and 1998 Master Address File Quality Improvement Program, Bureau of the Census, PRED, June\n1999. The purpose of the 1997 pilot study was to test the operational feasibility of using the Integrated Coverage\nMeasurement (ICM) methodology to measure the accuracy and completeness of the initial MAF. The pilot\nconcluded that with a few modifications, the ICM operational methodology worked for the 1997 Quality\nImprovement Program.\n\n                                                         8\n\x0cU.S. 
Department of Commerce                                                               Report OSE-12065\nOffice of Inspector General                                                                 September 2000\n\nevaluated six counties with high rates of geocoded addresses, found undercoverage rates ranging\nfrom 5.1 percent to 26.9 percent and overcoverage rates ranging from 6.7 percent to 18.2\npercent.12\n\nThe 1998 MAF Quality Improvement Program Study also found large error rates and concluded\nthat they confirmed the need for significant improvement in MAF-building operations before the\ndecennial. 13 The 1998 study included estimates of geocoding errors to account for housing units\nmissing from a block because they were assigned to the wrong block. At the national level, the\nstudy found a 9-percent undercoverage rate and a 13-percent overcoverage rate. The study also\nfound that 6 percent of addresses were geocoded to the wrong block, and 6 percent of existing\nhousing units with addresses in the MAF were not geocoded. At a regional level, undercoverage\nranged from 5 to 16 percent and overcoverage ranged from 8.5 to 16 percent. From 2.5 to 11\npercent of addresses were geocoded to the wrong block, and from 2 to 12 percent of existing\nhousing units with addresses in the MAF were not geocoded. In the six counties that were\nevaluated, undercoverage estimates range from 3 to 7 percent and overcoverage estimates range\nfrom 7 to 36 percent. From 2 to 7.5 percent of addresses were geocoded to the wrong block, and\nfrom 0.1 to 1.5 percent of existing housing units with addresses in the MAF were not geocoded.\n\nThe bureau also evaluated MAF housing unit undercoverage and overcoverage experienced\nduring the 1998 Dress Rehearsal. 
14 Undercoverage estimates at the dress rehearsal sites ranged\nfrom 2.9 to 24.6 percent, and overcoverage estimates ranged from 2.9 to 14.8 percent.\n\nAnother type of dress rehearsal evaluation of the DMAF was done by the Population Division to\nexamine the consistency of housing totals. This evaluation compared the numbers of addresses\nin the DMAF for a specific county to an independent demographic benchmark calculated by\nusing the results of the 1990 decennial, then adding the number of new housing units and\nsubtracting the number demolished. DMAF tallies differed widely from demographic\nbenchmarks in South Carolina where the DMAF did not retain housing units with mailing\naddresses that were post office boxes and did not adequately obtain address data for newly\nconstructed housing units. In areas not experiencing these problems, DMAF address totals were\n\n\n\n        12\n           Because this study only used residential geocoded addresses, it did not measure the extent to which\ncoverage errors were caused by coding errors. These types of errors include (1) geocoding errors, an address coded\nto the wrong block, which erroneously decreases housing units on one block while increasing them on others; (2)\nungeocodables, addresses on the MAF but not geocoded; and (3) nonresidential coding errors, addresses that are\nincorrectly coded nonresidential. This study also did not specifically look at coverage in multi-unit structures.\n        13\n             1998 Master Address File Quality Improvement Program\n        14\n         Results of the Housing Unit Matching Phase of the Integrated Coverage Measurement, Bureau of\nCensus, DSSD ICM Dress Rehearsal Results Memorandum Series Number HU-1, September, 1998.\n\n                                                        9\n\x0cU.S. 
Department of Commerce                                                         Report OSE-12065\nOffice of Inspector General                                                           September 2000\n\nbroadly consistent with the independent demographic benchmarks. These results indicate that\nbenchmarks provide a useful tool for evaluating the MAF and DMAF.15\n\nDecennial Operations Rely on and Improve the MAF, DMAF, and TIGER\n\nDuring the decennial, response collection operations use address data and can add to the\naccuracy and completeness of the MAF, TIGER, and the DMAF. In Nonresponse Followup\xe2\x80\x94a\ndecennial operation that occurs after the initial period when responses are returned from the\npublic\xe2\x80\x94temporary field staff, called enumerators, visit housing units with addresses for which\nthe bureau has not recorded a response. The bureau provides enumerators with lists of\ninformation from the DMAF for both respondents and nonrespondents. Enumerators use the\nlists to find nonrespondents and to help them identify whether respondents may have completed\na form for the wrong address and if so, to locate the true nonrespondents. Enumerators then\nattempt to interview a household member to obtain resident and housing unit information.\n\nThe Geography Division is responsible for the data processing that identifies the addresses\neligible for NRFU, and DSCMO is responsible for the data processing that generates the NRFU\nuniverse of nonrespondents and the address lists containing the data to produce enumerator\nwork lists. This processing occurred during March and April 2000. In addition to finding\nnonrespondents, enumerators also identify vacant housing units as vacant and nonexistent\naddresses as deletes. 
To the extent that nonexistent addresses are identified and flagged on the\nMAF and the DMAF, overcoverage is decreased.\n\nAlthough enumerators focus on obtaining missing responses rather than looking for missing\nunits, when they do discover the latter, they record pertinent data in the address register and\nattempt to complete a questionnaire for the household. If the address recorded on the completed\nquestionnaire can be geocoded and is not already in the MAF, it is added. To the extent that\nNRFU adds valid addresses to the MAF, undercoverage is decreased.\n\nDuring Coverage Improvement Followup\xe2\x80\x94an operation that followed NRFU\xe2\x80\x94the bureau\nenumerates new housing units found during update/leave, housing units associated with lost or\nblank questionnaires, partially completed Be Counted and Telephone Questionnaire Assistance\nquestionnaires, and new addresses or incomplete responses that were obtained too late to be in\nNRFU.16 CIFU also verifies some housing units classified as vacant or nonexistent in earlier\ndecennial operations. Similar to NRFU, Geography identified addresses eligible for CIFU, and\nDSCMO produced the CIFU universe. This processing occurred during June 2000. Similar to\n\n\n       15\n            Census 2000 Dress Rehearsal Evaluation Summary, Bureau of Census, PRED, August 1999.\n       16\n            Coverage Improvement Followup Program Master Plan, Bureau of Census, October 1999.\n\n                                                    10\n\x0cU.S. Department of Commerce                                                 Report OSE-12065\nOffice of Inspector General                                                   September 2000\n\nNRFU, enumerators identify vacant housing units as vacant and nonexistent housing units as\ndeletes. To the extent that nonexistent addresses are identified and flagged on the MAF and the\nDMAF, overcoverage is decreased. 
If a missing housing unit is found, the enumerator will add\nthe address and attempt to enumerate the household. To the extent that CIFU adds valid\naddresses to the MAF, undercoverage is decreased.\n\nTwo operations, the Be Counted Program and Telephone Questionnaire Assistance, although\nnot designed specifically to identify housing units, can result in new addresses. If a housing\nunit or households within a housing unit have not received a questionnaire from the bureau,\nthey can obtain an unaddressed Be Counted questionnaire. Similarly, a household can call a\ntoll-free number to submit a response regardless of whether they received a questionnaire in the\nmail. In both programs, if the address provided with the response can be geocoded and is not\nalready in the MAF, it will be added. Although the intent of these operations is to count people\nwithin housing units who are not included in the return for that unit, to the extent that valid\naddresses are added to the MAF, housing unit undercoverage is decreased.\n\nComputer Systems Process Information for Decennial Operations\n\nThe bureau has developed software to extract information from MAF and TIGER to update the\nDMAF and to process data obtained from decennial operations to update MAF and TIGER for\nfurther decennial operations. Software was used to merge the Postal Service file with the 1990\ndecennial address list to produce the initial MAF; reconcile TIGER with the MAF; produce\naddress lists for LUCA 98, block canvassing, and LUCA 98 Field Verification; merge data\ncollected from LUCA 98, block canvassing, and Postal Service file updates with MAF and\nTIGER; and produce MAF extracts for creating and updating the DMAF. MAF extracts are\nplanned for many DMAF updates during the decennial. Final housing unit selection from the\nMAF is delivered to headquarters processing for merging with the response data. 
The results of\nthat processing are submitted for the selection of housing units to be included in decennial\nresults. Figure 1 diagrams these components. Critical system subcomponents include the\nmatching and merging software and the geocoding software. The matching and merging\nsoftware updates existing address records or adds new address records with data obtained from\ndecennial operations. The geocoding software links an address in the MAF with a unique block\n(geographic location) in TIGER.\n\nFigure 1. System Components for MAF and TIGER\xc2\xae\n\n\n                          OBSERVATIONS AND CONCLUSIONS\n\nI. Sufficient Time Not Available to Ensure High Quality Address Data\n\nAlthough the bureau made a strong case for reengineering its approach to building the MAF and\nspent close to $100 million for the necessary operations, making that decision just two years\nbefore the addresses were needed did not leave enough time to carry out the new plan. A bureau\nstudy of address quality in the initial DMAF raised concerns about high levels of housing unit\novercoverage and undercoverage but was not designed to report on the cause. We found that\nLUCA 98 schedule slips and bureau policy regarding which addresses were eligible to be\nincluded in block canvassing reduced the effectiveness of this major operation designed to\nimprove address quality. 
In addition, the bureau did not have sufficient time to resolve\nconflicting LUCA 98 and block canvassing addresses before including these and other unresolved\naddresses on the initial DMAF for questionnaire addressing. Housing unit overcoverage, caused\nby including over 5 million unresolved addresses on the DMAF, increased the burden on\ndecennial operations to resolve nonexistent and duplicate addresses.\n\nThe bureau created the MAF by updating the 1990 address file with Postal Service and local/tribal\naddress information and plans to update and use the MAF resulting from the 2000 decennial as\nthe basis for future surveys and censuses, including the 2010 decennial. Using its experiences\nfrom the 2000 decennial as a guide, the bureau needs to develop an improved approach for\nupdating the MAF that includes sufficient time to conduct MAF-building operations designed to\ndetect and resolve overcoverage and undercoverage and that incorporates a clear definition of\naddresses eligible for these operations.\n\nA.     High Levels of Data Quality Not Achieved for Initial DMAF\n\nThe Population Division compared its 1998 estimates of housing unit coverage to numbers of\naddresses in the initial DMAF to find potential undercoverage and overcoverage problems. That\ncomparison showed that a year after it had estimated a total of 112.5 million housing units\nnationwide, the initial DMAF contained 120.2 million units\xe2\x80\x94a difference of 7.7 million or 7\npercent. Knowing that the national average can vary considerably at a county level and wanting\nto provide a tool for flagging counties with potential overcoverage and undercoverage, analysts\ncompared the benchmarks with the total contained in the initial DMAF for each county. The\n\n\n\n\n                                               13\n\x0cU.S. 
Department of Commerce                                                                   Report OSE-12065\nOffice of Inspector General                                                                     September 2000\n\nmethodology separated counties by type of enumeration area17 and tallied the number of\naddresses for each county. The analysis found that of 148 counties with only city-style\naddresses, 57 percent had measurable overcoverage or undercoverage.18 Similarly, of 1,499\ncounties with a combination of city-style and non-city style addresses, 67 percent had\nmeasurable overcoverage or undercoverage.19 As noted previously, housing unit coverage will\nimprove during the decennial\xe2\x80\x94bureau operations will add housing units and eliminate\nnonexistent and duplicate housing units. In addition, housing unit estimates may not adequately\nreflect fluctuations in population for some counties. However, such a high percentage of\ncounties with coverage discrepancies indicates significant data quality problems at the start of the\ndecennial.\n\nFor the purpose of the Population Division\xe2\x80\x99s analysis, counties were considered to have\nundercoverage if the DMAF count was below the 1998 estimate. In these counties, the housing\nunit count is expected to fall even more after the normal process of removing duplicate and\nnonexistent addresses. Unless update operations add housing units to make up the difference,\nthere will be a shortfall. The study identifies counties where the DMAF is 0 to 5 percent higher\nthan the 1998 estimate as also having the potential to fall below the 1998 estimate after duplicate\nand nonexistent housing units are removed during the decennial.\n\nFor the same analysis, counties were considered to have overcoverage if the DMAF count was\nabove the 1998 estimate by 20 percent. 
In these counties, the bureau expects a larger than normal\n\n\n         17\n          Types of Enumeration Areas (TEAs) include TEA 1 - Block Canvassing and Mailout/Mailback; TEA 2 -\nAddress Listing and Update/Leave; TEA 3 - List/Enumerate; TEA 4 - Remote Alaska; TEA 5 - \xe2\x80\x9cRural\xe2\x80\x9d Update/\nEnumerate; TEA 6 - Military; TEA 7 - \xe2\x80\x9cUrban\xe2\x80\x9d Update/Leave; TEA 8 - \xe2\x80\x9cUrban\xe2\x80\x9d Update/Enumerate; and TEA 9 -\nAdditions to Address Listing Universe of Blocks. For further discussion of operations for each TEA, see Census\n2000 Operational Plan Using Traditional Census-taking Methods, Bureau of the Census, January 1999.\n         18\n            Nationwide, there are 3,142 counties. In addition to the 148 city-style and 1,499 city-style and non-city-\nstyle counties cited above, the study included 818 counties that had all non-city-style addresses and 460 counties that\nhad a combination of city-style, non-city-style, and other TEAs. The report also did not cover another 217 counties,\nprimarily because these counties\xe2\x80\x99 addresses were obtained through other than MAF-generated activities. We included\nthe bureau\xe2\x80\x99s analysis of counties with only TEA 1 and with a combination of TEAs 1 and 2. We did not include the\nresults for TEA 1 in combination with other TEAs. 
For example, the results from Cook County, Illinois, and\nPhiladelphia County, Pennsylvania, were not included because these counties are a combination of TEAs 1 and 7.\n         19\n           The study results are documented in Count Review Memorandum Series 99-01, Subject: Results from\nthe County Level Demographic Benchmark Analysis of the Decennial Master Address File\xe2\x80\x93Part A: Differences 5\nPercent or below for Selected Types of Enumeration Areas, January 10, 2000, and Count Review Memorandum\nSeries 99-02, Subject: Results from the County Level Demographic Benchmark Analysis of the Decennial Master\nAddress File\xe2\x80\x93Part B: Differences in Excess of 10 Percent for Selected Types of Enumeration Areas, February 10,\n2000, issued by the bureau\xe2\x80\x99s Population Analysis and Evaluation Staff, Population Division.\n\nnumber of deleted addresses resulting from removing duplicate and nonexistent addresses. The\nstudy also identifies counties where the DMAF is 10 to 20 percent higher than the 1998 estimate\nas counties with a similar potential for a higher number of deleted addresses resulting from\nremoving duplicate and nonexistent housing units. 
Table 1 shows the results of the analysis,\nincluding the total number of counties in each category and overcoverage and undercoverage\ncombined.\n\nTable 1: Demographic Evaluation of Initial DMAF Undercoverage and Overcoverage by\nCounties Having City-Style and Combination City-Style and Non-City-Style Addresses\n\n                            Counties with Overcoverage      Counties with Undercoverage     Total Undercoverage\n Type of Address and        Exceeds 1998    Exceeds 1998    Under 1998    Exceeds 1998      and Overcoverage\n Number of Counties         by 20% or       by 10 to 20%    Estimate      Estimate by\n                            higher                                        0-5%\n\n City-Style\xe2\x80\x94148 counties    10 (7%)         27 (18%)        6 (4%)        41 (28%)          84 (57%)\n\n City-Style and Non-City-\n Style\xe2\x80\x941,499 counties       37 (3%)         264 (18%)       163 (11%)     543 (36%)         1,007 (67%)*\n\n*Percentage total difference due to rounding.\n\nThe results for the counties cited above raise questions about the success of reengineering in\nmeeting overcoverage and undercoverage goals. Reengineering in some counties appears to\nhave resulted in undercoverage rates comparable to those experienced at the time of the dress\nrehearsal. 
Given that the 100 percent block canvassing component of the reengineering was\nintended to improve the MAF beyond levels experienced during the dress rehearsal, we are\nrecommending that the bureau explore what caused overcoverage and undercoverage and use\nthe results as lessons learned when planning future MAF improvement operations.\n\nB.      LUCA 98 Schedule Slips and Bureau Policy Reduced the Effectiveness of\n        Block Canvassing\n\nThe bureau designed block canvassing so that it would include all LUCA 98 addresses and verify\nboth LUCA 98 and Postal Service address data. Prior bureau evaluations indicate that erroneous\naddresses are more likely to be corrected by field operations if they are included on the address\nlists used during the operation.20 However, not all LUCA 98 or MAF addresses were included in\nblock canvassing. Thus, block canvassing enumerators did not have complete housing unit\naddress lists, making it difficult to find missing addresses and areas not covered effectively by the\nPostal Service. As bureau officials have explained to us, although block canvassing is designed\nto be a 100-percent verification of assigned areas, listers often rely too heavily on address lists\nand associated TIGER maps and do not find areas where addresses are not listed or where streets\nare missing on the maps. 
Not including all MAF addresses in block canvassing resulted in\nunverified addresses being included on the initial DMAF.\n\nWe found that block canvassing lists were incomplete for two reasons: (1) the LUCA 98\noperation took longer than the bureau estimated, resulting in approximately 98.6 percent of\nLUCA 98 addresses that were not available in time to be included in block canvassing, and (2)\nblock canvassing address lists did not include some MAF addresses from the Postal Service file,\nas well as other addresses that were not geocoded.21 Specifically, MAF addresses added by the\nNovember 1997 Postal Service file but not included on the September 1998 Postal Service file\nwere omitted from block canvassing address lists. Bureau officials told us that there was not a\nclear policy on which addresses should be included for block canvassing, and at that time, a\ndecision was made not to use these addresses, which the Postal Service no longer considered\nvalid. However, this decision was not consistent with the later decision to include these same\naddresses in the initial DMAF. Addresses with missing geocodes were not included on block\ncanvassing lists because bureau officials believed that addresses without block numbers would be\ndifficult to find. 
However, since other identifying information\xe2\x80\x94such as the street address, city,\nstate, and zip code\xe2\x80\x94was available, we believe that these addresses could also have been verified.\n\nWe are recommending that in establishing its strategy for updating the MAF for future surveys\nand censuses, the bureau ensure that sufficient time is planned for MAF improvement operations.\nWe are further recommending that this strategy include developing a consistent policy for\naddress eligibility for these improvement operations and an approach for verifying addresses\nwithout block codes.\n\n\n\n\n        20\n          As cited in Additional Steps Needed to Improve Local Update of Census Addresses for the 2000\nDecennial Census, IPE-10756, September 1998.\n        21\n          2000 Census, Local Address Review Program Has Had Mixed Results to Date, GAO Testimony,\nStatement of J. Christopher Mihm, Associate Director, Federal Management and Workforce Issues, General\nGovernment Division, GAO/T-GGD-99-184, September 29, 1999 and Census Bureau Geography Division data\nspreadsheet.\n\n                                                     16\n\x0cU.S. Department of Commerce                                                                Report OSE-12065\nOffice of Inspector General                                                                  September 2000\n\nC.      Insufficient Time Also Contributed to Erroneous Addresses on the Initial DMAF\n\nIn order to minimize the chances of omitting valid addresses from the decennial, the bureau\nincluded addresses that MAF operations had indicated were in error or that had conflicting\nindications from different operations on the initial DMAF. This inclusive approach allowed 5.2\nmillion unresolved addresses to be delivered in July 1999 to the printing operation that labeled\nquestionnaires for mail delivery. 
Unresolved addresses were those that block canvassing\nrecommended be deleted because they were nonexistent, uninhabitable, or nonresidential, or that\nhad been added by LUCA 98 but not by block canvassing. The bureau decided that unresolved\naddresses would be reviewed under a LUCA 98 Field Verification operation.22 However, this\noperation was not completed until several months after the address file was delivered to the\nprinter. We believe that this approach contributed to the high level of overcoverage reported\nin the Population Division study discussed in the first part of this observation.\n\nThe bureau\xe2\x80\x99s inclusive approach also resulted in the inclusion of an unknown number of duplicate\naddresses in the decennial. We found two reasons why duplicates occurred. First, some\naddresses identified as duplicates by block canvassing were included because the results\nconflicted with LUCA 98. Second, a decision to alter software processing so that address\nupdates of individual apartments, trailer park lots, and certain other housing unit types would\nnot be incorrectly merged into a single address allowed duplicates of other addresses to occur.23\n\nThe high level of undercoverage and overcoverage that occurred due to the lack of time to verify\naddresses underscores the need for the bureau to ensure that sufficient time is allotted for MAF\nimprovement operations in the future. As the bureau continues to update the MAF with Postal\nService or local/tribal address information, preventing duplicate addresses will remain a\nchallenge. Therefore, we are recommending that the bureau study the causes of duplicate\naddresses and implement methods to prevent them.\n\n\n\n\n        22\n            LUCA 98 addresses that were not accepted by the bureau during block canvassing were to be rechecked\nduring a subsequent inspection operation called reconciliation, where bureau employees were to verify the block\ncanvass results between March and August 1999. 
However, the delays in LUCA 98 created the complication of some\nLUCA 98 addresses not included in block canvassing having to be matched to block canvassing results to determine\ntheir status. This process and other delays resulted in the delay of the reconciliation to verify LUCA 98 addresses.\n        23\n           Block Canvassing Address Merge Rules, Memorandum from Robert Marx, Chief, Geography Division\nto Distribution List, June 25, 1999.\n\n                                                        17\n\x0cU.S. Department of Commerce                                                            Report OSE-12065\nOffice of Inspector General                                                              September 2000\n\nII.    The Bureau Has Taken Steps to Improve Address Data Quality,\n       but More Should Be Done\n\nThe bureau took steps to identify erroneous addresses before starting NRFU. However, similar\nto the bureau\xe2\x80\x99s lack of a clear policy on address eligibility for MAF-building operations, its\npolicy for defining address eligibility during the decennial is also not well-defined. In\nparticular, the specification that determines which housing units will be included in the final\ndecennial count has not been issued. The bureau could also make better use of information\nalready on the MAF to identify missing housing units and potential geocoding errors where a\nMAF address does not link to the correct TIGER location. To improve address data quality for\nCensus 2000 and future censuses and surveys, the bureau should (1) define a clear and visible\npolicy for determining address eligibility during the decennial and (2) devise methods of using\nMAF address history information to guide coverage improvement operations.\n\nA.     
The Bureau Identified Some Erroneous Addresses and\n       Flagged Them As Ineligible for Nonresponse Followup\n\nIn our review of a sample of addresses in Prince George\xe2\x80\x99s County, Maryland, we found evidence\nof systemic overcoverage that we believe needs to be quantified and targeted by the bureau. The\novercoverage included both duplicate addresses and addresses for nonexistent or uninhabitable\nhousing units.\n\nFor example, we found a street with 28 townhouses, of which 26 had their addresses listed twice\nin the MAF. Initially, addresses for the 28 townhouses were added to the MAF with errors\ncaused by a slightly incorrect spelling obtained from the Postal Service file and an incorrect\nblock number caused by an error in TIGER. LUCA 98 added correct addresses and block\nnumbers for 26 of these townhouses. All 54 addresses were included in the initial DMAF.\nAs we pointed out earlier, the bureau\xe2\x80\x99s software had been modified to merge fewer\nduplicates and thus allowed more of these duplicate addresses in the initial DMAF.\n\nAs we also pointed out earlier, another reason for overcoverage was the lack of a clear policy on\nwhich addresses should be included for block canvassing. For example, we found an apartment\ncomplex of some 500 uninhabitable units whose MAF addresses were excluded from block\ncanvassing lists but included in the initial DMAF. The September 1998 Postal Service file did\nnot contain these addresses, disqualifying them from block canvassing.24 However, the\n\n\n\n\n       24\n            Master Address File Extract for Block Canvassing, Bureau of Census memorandum, October 7, 1998.\n\n                                                      18\n\x0cU.S. 
Department of Commerce                                                             Report OSE-12065\nOffice of Inspector General                                                               September 2000\n\nNovember 1997 Postal Service file had designated these units as residential, qualifying them for\nthe initial DMAF.25\n\nThe bureau agrees with our assessment that known duplicate addresses and addresses not\nverified by block canvassing and added solely on the basis of the November 1997 Postal\nService file are likely to be erroneous. The bureau has acted to remove these addresses, which\ninclude 0.5 million duplicate addresses, 0.9 million obsolete Postal Service addresses, 1.5\nmillion addresses known as LUCA 98 provisional adds (out of 2.6 million), and 1.4\nmillion addresses recommended for deletion by block canvassing.26 We believe the bureau\nmade the right decision in taking steps to not include these addresses in nonresponse followup,\nwhich saved the cost of enumerator visits and enhanced the enumerators\xe2\x80\x99 ability to obtain actual\nresponses. Followup operations are finding additional nonexistent addresses. For example, as\nof June 14, 2000, the bureau has reported that its nonresponse followup operation found 5.9\nmillion nonexistent addresses. The final number of nonexistent housing units will be\ndetermined after completion of coverage improvement followup.\n\nB.      The Bureau\xe2\x80\x99s Policy for Determining Address Eligibility\n        During the Census Is Not Well-Defined\n\nAlthough the bureau has identified almost 20 million nonexistent and vacant addresses (15.6\nmillion in nonresponse followup plus the 4.3 million addresses before nonresponse followup),\nthe decision about the criteria for which addresses to include in the final decennial count has not\nbeen made. 
The bureau does not have a written policy for determining address eligibility.\nBureau officials cite a \xe2\x80\x9cdouble-strike\xe2\x80\x9d or \xe2\x80\x9cdouble-kill\xe2\x80\x9d policy, whereby addresses have to be\ndeleted by two operations to be ineligible for remaining decennial operations. However, this\npolicy has not been documented in the decision memorandum series that establishes and\ncommunicates bureau policy for the decennial. Instead, the policy is embedded in DMAF\ndeliverability criteria found in software specifications, which often are not finalized until data\nprocessing for the operation is imminent or underway. We found that the actual policy implicit\nin DMAF specifications includes more than the double-strike rule described by decennial\nmanagers. For example, one set of DMAF specifications included criteria for deleting addresses\nnot found on Postal Service files delivered after September 1998 that are not part of the double-\n\n\n\n\n        25\n         Specification of the Decennial Master Address File Deliverability Criteria for Census 2000, DSSD\nCensus 2000 Procedures and Operations Memorandum Series #D-1, Bureau of Census, June 30, 1999.\n        26\n          Some overlap between the duplicates and the block canvass deletes and between the obsolete addresses\nand the LUCA provisional adds is possible.\n\n                                                      19\n\x0cU.S. Department of Commerce                                                           Report OSE-12065\nOffice of Inspector General                                                             September 2000\n\nstrike policy.27 In another set of DMAF specifications, addresses said to be seasonal or\nrecreational only needed one strike to be ineligible for CIFU operations. 
In another case in this\nsame specification, an address designated by block canvassing as nonresidential and as a delete\nby NRFU would still be included in CIFU because the block canvassing delete no longer\nqualified as a strike.28 Finally, we observed address eligibility decisions being made up to the\ndeadline for delivering NRFU-eligible addresses, and the bureau has yet to complete the DMAF\nspecification for addresses eligible for inclusion in the final decennial results.\n\nFor future censuses and surveys, including the 2010 decennial, we are recommending that the\nbureau develop an address eligibility policy in advance that defines and explains the rationale to\nbe used in selecting addresses. For the current decennial, we are recommending that the bureau\nissue a decision memorandum that explains the address eligibility policy for the final list of\naddresses to be included in the decennial.\n\nC.      The Bureau Should Use the MAF as a Management Tool\n\nData on the MAF provides valuable information for increasing the accuracy and completeness\nof addresses. For example, we found a street with approximately 140 townhouses built since\n1990 whose addresses were added by the Postal Service file, verified by LUCA 98, but not\nindicated on the MAF as verified by block canvassing. Especially where no LUCA 98\nparticipation occurred, areas not block canvassed may have missing or inaccurate addresses.\nThe bureau should identify areas with high percentages of addresses with no block canvassing\naction codes for future verification. As of April 2000, there were 9.5 million of these addresses\nin the DMAF. Possible reasons for their inclusion are that some were located by other\noperations or the housing units were built recently. 
However, areas with a preponderance of no\nblock canvassing action codes also could have missing housing units.\n\n\n\n\n        27\n           Definition of the Nonresponse Followup Universe for Census 2000, DSSD Census 2000 Procedures and\nOperations Memorandum Series #BB-3, Bureau of Census, November 1, 1999; Specification for Identifying the\nNonresponse Followup Universe for Census 2000, DSSD Census 2000 Procedures and Operations Memorandum\n#BB-4R, Bureau of Census, February 2, 2000; and Specifications for Updating the Decennial Master Address File\nin April 2000, DSSD Census 2000 Procedures and Operations Memorandum #D-6, Bureau of Census, March 27,\n2000.\n        28\n           Definition of the Coverage Improvement Followup Universe for Census 2000, DSSD Census 2000\nProcedures and Operations Memorandum Series #CC-2 Revision #4, Bureau of Census, June 28, 2000; Revision\nSpecification of the Coverage Improvement Followup Universe for Census 2000, DSSD Census 2000 Procedures\nand Operations Memorandum Series #CC-3, Revised #4, Bureau of Census, June 28, 2000; and Specification for\nUpdating the Decennial Master Address File on June 15, 2000, DSSD Census 2000 Procedures and Operations\nMemorandum Series #D-7 Revised, Bureau of Census, May 22, 2000.\n\n                                                     20\n\x0cU.S. Department of Commerce                                                  Report OSE-12065\nOffice of Inspector General                                                    September 2000\n\nThe bureau could also use MAF data to identify addresses with incorrect geocodes. For\nexample, we found two houses on an unpaved road not easily visible that were geocoded to the\nwrong block. These addresses were included in LUCA 98 and block canvassing. LUCA 98\nverified them, but block canvassing deleted them, perhaps because they could not be found in\nthe block to which they were assigned. 
Subsequently, LUCA 98 Field Verification verified that\nthey existed. No operation corrected the block number, and the units were delivered to the\nDMAF assigned to the wrong block. As of April 2000, there were over one million addresses in\nthe DMAF that LUCA 98 Field Verification had verified but that block canvassing\nindicated should be deleted. We believe addresses that LUCA 98 Field Verification verified but\nblock canvassing recommended for deletion may be housing units that are assigned to the\nwrong block.\n\nWhile it is too late to carry out additional field operations to take advantage of MAF information\nfor this decennial, we are recommending that in the future, the bureau maximize use of MAF\ninformation to identify areas where addresses are more likely to be missed or incorrectly\ngeocoded.\n\nIII.   Improved Software Engineering Standards Could Improve Data Quality\n\nAddresses in the MAF must be consistent with TIGER to enable these addresses to be geocoded\nto geographic locations accurately. However, because MAF and TIGER were developed\nseparately and are not integrated, consistency between them cannot be easily maintained. TIGER\ndata can be modified without ensuring that both databases have accurate and consistent\ninformation, causing some decennial addresses to no longer link to TIGER. In addition, some\ndecennial addresses received a geocode from block canvassing but do not link to TIGER because\neither the geocode or the TIGER data was incorrect. As of April 2000, 4.5 million decennial\naddresses did not have a current link to TIGER. Although these housing units will still be in the\ncensus, they risk being inaccurately located. The bureau needs to take steps to ensure that\ndecennial addresses are geocoded accurately. In the future, the bureau intends to modernize\nMAF and TIGER to fix consistency issues, among other things. To make the modernization a\nsuccess, the bureau needs to adhere to its software engineering standards. 
These standards will\nguide analysts and developers toward building accurate, well-understood system components\nthat work together to safeguard data consistency.\n\nA.     Consistency Between MAF and TIGER Not Easily Maintained\n\nSoftware systems build the links between the MAF and TIGER databases either automatically or\nby allowing bureau staff to access TIGER on-line and add or modify TIGER data so that each\n\n\n\n\n                                               21\n\x0cU.S. Department of Commerce                                                             Report OSE-12065\nOffice of Inspector General                                                               September 2000\n\naddress links to a location and can be geocoded.29 Inconsistencies occur when the name of a\nstreet or the range of house numbers associated with a street is modified or a street is deleted in\nTIGER. Such changes cause addresses in the MAF that are already linked to that street to no\nlonger geocode. For example, in our field verification we found a street that was incorrectly\nlisted twice in TIGER. An apparent attempt to correct this problem resulted in the name of\nanother nearby street being changed to the same name as the street incorrectly listed twice. This\naction created three streets in TIGER with the same name: one real street, one fictitious street,\nand one incorrectly named street. The incorrectly named street had housing unit addresses on\nthe MAF and DMAF for which questionnaires had been delivered. The bureau official assisting\nus with the TIGER queries explained that because MAF and TIGER are not integrated, changes\nmade in TIGER can cause MAF addresses to no longer link to a TIGER block.\n\nThe addresses described above are typical of addresses that will not geocode to TIGER during\nproduction runs that occur during periodic Postal Service file updates. 
These addresses need to\nbe sent to the clerical part of the process, which then has to correct TIGER so that these\naddresses will once again geocode. During the April 2000 update of the MAF, 4.5 million\ndecennial addresses did not geocode to TIGER.\n\nAs bureau officials have explained to us, an operation has been implemented to resolve decennial\naddresses that do not geocode to TIGER when they are in blocks that border on or intersect\njurisdictional boundaries. Since this effort applies to the cases with the greatest impact on\ndecennial results, the bureau expects this operation to resolve the issue satisfactorily. However,\nthe remaining addresses are not geocodable and may be located in error or without any\ngeographic location when future TIGER updates are made. To prevent the loss of additional\naddress links to TIGER in Census 2000, we are recommending that the bureau ensure that any\nfurther changes made to TIGER during the decennial are verified with the MAF. After the\ndecennial, the bureau should devise methods to resolve all addresses that do not geocode to\nTIGER.\n\nB.      Software Engineering Standards Will Support Modernization\n\nThe bureau\xe2\x80\x99s FY 2001 budget submission requests funding to replace MAF and TIGER with a\nmodern system that will use current MAF and TIGER data. In the past, we have reported that the\nsoftware development approach used by the bureau for the decennial was not based on software\n\n        29\n           First, completely automated processing attempts to associate each address with a street and block in\nTIGER to geocode the address. Addresses that do not geocode on this first attempt are sent to a process called\nMaster Address File Geocoding Office Resolution. This process consists of manual and on-line procedures\nthrough which clerks use maps and other local information to identify the location of an address and the\ncorresponding block number. 
TIGER is then updated on-line with new streets, modified street names, or extended\naddress ranges.\n\n                                                      22\n\x0cU.S. Department of Commerce                                                             Report OSE-12065\nOffice of Inspector General                                                               September 2000\n\nengineering standards for documenting requirements, reviewing software specifications and\ndesign, and ensuring that rigorous, independent testing is carried out.30 This review found similar\nissues. Several specifications used by software programmers defined requirements in a narrative\nfashion, leaving room for ambiguity. After requesting written test plans, procedures, and results,\nwe were also told that testing is informal and not well documented. Bureau Geography Division\nofficials have stated that they would like to adhere to more formal software engineering standards\nand are exploring the use of CASE31 tools to help set up a \xe2\x80\x9ctop-down\xe2\x80\x9d software engineering\napproach to analyzing MAF and TIGER functionality before modernizing these systems.\nHowever, while CASE tools are useful, they are not a substitute for a complete life-cycle\ndevelopment process based on software engineering standards.\n\nThe Census Software Development Life Cycle manual32 documents a process and standards for\nthe first three phases of the life-cycle approach: system requirement definition, system\nrequirements analysis, and software requirements definition and analysis. The manual,\naugmented by software engineering standards for the balance of the software life-cycle, would\nprovide a helpful guide for developing a system that is soundly engineered and functionally\ncorrect. Using the rigor that these standards provide will help system developers realize system\ndesigns that meet user needs and maintain address data quality. 
Recognizing the importance of\nrigorous testing and quality assurance, the Decennial Management Division recently issued\nprocedures to improve its software development process for CIFU33 and is also improving the\nprocess for subsequent decennial software development. We believe that the bureau is doing the\nright thing by improving its software development process and that improvement should be\nexpanded to the MAF and TIGER systems. Therefore, we are recommending that the bureau\nadopt software engineering standards as part of the MAF and TIGER modernization.\n\n\n\n\n        30\n          PAMS/ADAMS Should Provide Adequate Support for the Decennial Census, but Software Practices\nNeed Improvement, OSE-11684, March 2000; Improvements Needed in Multiple Response Resolution to Ensure\nAccurate, Timely Processing for the 2000 Decennial Census, OSE-10711, September 1999; and Headquarters\nInformation Processing Systems for the 2000 Decennial Census Require Technical and Management Plans and\nProcedures, OSE-10034, November 1997.\n        31\n          Computer Assisted Software Engineering (CASE) describes automated tools that aid software\ndevelopers throughout all phases of the life cycle.\n        32\n             The Census Software Development Life Cycle, Bureau of the Census, November 7, 1994.\n        33\n             Coverage Improvement Follow-up Development Quality Assurance Plan, Bureau of the Census, May\n30, 2000.\n\n                                                       23\n\x0cU.S. Department of Commerce                                                               Report OSE-12065\nOffice of Inspector General                                                                 September 2000\n\n\nIV.     
Success in Meeting Housing Unit Accuracy and Completeness Goals Should Be\n        Reported\n\nThe MAF is often referred to as the heart of Census 2000 because it is intended to identify the\nhouseholds that will be counted,34 and the quality of addresses for those households has a direct\nimpact on the accuracy and completeness of the decennial. As we noted earlier, the bureau\nfound that in the 1990 decennial, nearly one in every three persons who were missed was not\ncounted because the housing unit was not in the address file. To the extent feasible, having an\naccurate and complete file will reduce the cost of the census. Duplicate and nonexistent\naddresses increase followup costs because of the added complexity of resolving duplicate\naddresses and the attempts to visit many nonexistent and, therefore, nonresponding addresses.\n\nRecognizing the importance of a complete and accurate list of housing units, the bureau has\ncreated housing unit coverage performance standards and methods to evaluate whether they are met.\nThe Accuracy and Coverage Evaluation will measure housing unit coverage. However, the\nbureau has not clearly stated how evaluation results will be reported against the standard for this\ndecennial. The Government Performance and Results Act of 1993 requires agencies to report\nperformance goals, measures, and accomplishments to the President and the Congress in an\nAnnual Performance Plan and an Annual Program Performance Report. These documents are\nappropriate vehicles for reporting on this important data quality measure and can help ensure that\nthe bureau works to improve data quality and, in turn, all of its operations that rely on the MAF.35\n\nA.      The Bureau Has Housing Unit Coverage Goals and Evaluation Plans\n\nThe bureau has already cited housing unit coverage standards in its reengineering plan. 
The goal\nwas a net housing unit undercoverage standard of 1 percent, with 2.5 percent missing and 1.5 percent\nerroneous.36 This standard was based on the bureau\xe2\x80\x99s goal of doing better than the 1990\n\n\n\n\n        34\n             Census 2000 MAF Program Master Plan draft, Bureau of Census, May 1999.\n\n        35\n          Performance Information Challenges, GAO/GGD-00-52. Also, see P.L. 103-62, August 3, 1993,\nGovernment Performance and Results Act and OMB Circular A-11, Part 2, July 1999 for information on the\npreparation and submission of strategic plans, annual performance plans, and annual program performance reports.\n        36\n            The reengineering plan cited no more than 1 percent duplicates as part of the overcoverage measurement\ngoal. We estimated the total overcoverage measurement goal, including duplicates and other nonexistent housing\nunits, as 1.5 percent based on the bureau\xe2\x80\x99s goal for missing housing units of 2.5 percent and net housing unit\nundercoverage measurement of 1 percent. The Dress Rehearsal Report Card also cited a 1-percent net housing unit\nundercoverage standard for city-style addresses.\n\n                                                       24\n\x0cdecennial, in which the net housing unit undercoverage was 1.4 percent, with 3.4 percent missing\nand 2 percent erroneous.37\n\nThe bureau has a wealth of information that can be used to assess whether MAF performance\ngoals are being met and how performance can be improved. The bureau will have information\nabout MAF data quality through the Accuracy and Coverage Evaluation, which is being\nconducted to determine not only the number of people missed, but also the number of housing\nunits missed or incorrectly counted. 
In addition, the bureau\xe2\x80\x99s January 1999 operating plan\ncontains plans to obtain information about the quality of decennial data through evaluations of\nkey components of the decennial process. This plan outlines numerous evaluation studies and\nreports on how well operations worked in building the MAF. The evaluation results will be\nuseful in fully understanding the results of the Accuracy and Coverage Evaluation, including\nidentifying successful operations and those that should be improved.\n\nB.      The Annual Performance Plan and Report Should Be Used to Report Housing Coverage\n\nThe Government Performance and Results Act requires federal agencies to report to the President\nand the Congress on performance goals for each major program activity and accomplishments in\nmeeting those goals. Under this law, the head of each agency is to submit a plan and report each\nyear on program activities. The plans are to establish performance goals to define the level of\nperformance to be achieved by each program activity. The performance reports are to contain\ntwo main parts: (1) a report on the actual performance achieved as compared to the performance\ngoals expressed in the performance plans and (2) the plans and schedules to achieve those goals\nthat were not met. If a performance goal becomes impractical or infeasible, the agency is to\nexplain why that is the case and what legislative, regulatory, or other actions are needed to\naccomplish the goal, or whether the goal ought to be modified or discontinued. Finally, the\nreports should relate performance measurement information to program evaluation findings in\norder to give a clear picture of the agency\xe2\x80\x99s performance and its efforts at improvement.\n\nThe bureau has identified ways to measure housing unit coverage but has not stated how results\nwill be reported to decision-makers for this decennial. One of the decennial\xe2\x80\x99s main goals is to\nprovide quality data. 
According to the Department\xe2\x80\x99s Fiscal Year 2000 Annual Performance Plan,\nthe bureau plans to measure success in achieving data quality by its ability to achieve a 0.1-\npercent net undercount.38 Because the decennial is a census of housing as well as of the\n\n\n        37\n          The bureau\xe2\x80\x99s 1990 Housing Unit Coverage Study, Preliminary Research and Evaluation Memorandum\nNo. 193. Net housing unit undercoverage of 1.4 percent was derived by subtracting the bureau reported erroneous\nenumerations of 2 percent from the bureau report of omissions of 3.4 percent.\n        38\n             The Department of Commerce Annual Performance Plan, Fiscal Year 2000.\n\n                                                      25\n\x0cU.S. Department of Commerce                                                          Report OSE-12065\nOffice of Inspector General                                                            September 2000\n\npopulation, the bureau\xe2\x80\x99s portion of the Department\xe2\x80\x99s Annual Performance Plan should include\nhousing coverage as a performance measurement. The bureau planned to report on its Census\n2000 accomplishments in a Report Card similar to that issued for the dress rehearsal; however,\nthese plans have been canceled. The Annual Performance Plan and Annual Program\nPerformance Report offer appropriate vehicles for providing this information.\n\n        Reporting Should Include Overcoverage and Undercoverage Components\n\nIn reporting MAF goals and performance, the bureau should include not just net housing unit\ncoverage, but also the overcoverage and undercoverage components in order to provide a\ncomplete report on MAF data quality.39 In the 1998 Dress Rehearsal, the bureau only reported its\naddress goals in terms of a net housing unit undercoverage of 1.5 percent. 
This means that\nundercoverage and overcoverage combined, as calculated using the bureau\xe2\x80\x99s estimation\ntechniques, should result in a net housing unit undercoverage no greater than 1.5 percent. For\nexample, in Columbia, South Carolina, the bureau reported net housing unit undercoverage of\n2.9 percent, which did not meet the net housing unit undercoverage standard of 1.5 percent.\nHowever, the two components that comprise the 2.9 percent net undercoverage measure provide\nimportant insight into the extent of overcoverage and undercoverage. The MAF missed 13\npercent of the housing units in Columbia and erroneously included 10.4 percent of the housing\nunits.40 It is also possible for there to be both a large percentage of undercoverage and a nearly as\nlarge percentage of overcoverage, yet the net percentage of undercoverage could fall within the\nstandard. The details behind the overall measure provide important insights into the extent of\nmissing housing units and housing units counted in error that can be obscured by only reporting\nnet undercoverage.\n\nTherefore, we are recommending that the bureau include the housing unit coverage standards and\nresults in achieving them\xe2\x80\x94including overcoverage and undercoverage\xe2\x80\x94as a second performance\nmeasure in its input to the Department\xe2\x80\x99s Annual Performance Plan and Program Performance\nReport for fiscal year 2002. 
By including housing unit coverage as a measure of decennial data\nquality, the bureau will gain a vehicle for reporting an important indicator of decennial success\nand will link the housing unit coverage standards with the evaluation of housing unit coverage.\nTo justify steps to improve coverage, we are also recommending that the bureau retain the\nhousing unit coverage measure as input to subsequent Annual Performance Plans and Program\nPerformance Reports.\n\n\n\n        39\n             See Reengineering Plan.\n        40\n         Results of the Housing Unit Matching Phase of the Integrated Coverage Measurement, Bureau of the\nCensus, DSSD ICM Dress Rehearsal Results Memorandum Series Number HU-1, September 9, 1998.\n\n                                                    26\n\x0c       MAF Quality Improvement Program Should Be Continued\n\nTo ensure that it has the data needed for future annual reporting, the bureau should continue its\nquality assurance of MAF completeness and accuracy in terms of undercoverage and\novercoverage by using either of the methods discussed below. In 1997, the bureau\xe2\x80\x99s evaluation\nstaff designed an approach that would provide an annual measurement of MAF accuracy and\ncompleteness based on the methodology used in 1990 for statistical sampling and adjustment.\nThe bureau used this approach in MAF quality improvement programs in 1997 and 1998. The\n1997 MAF Quality Improvement Program Pilot Study measured the completeness and accuracy\nof MAF residential addresses geocoded to a block in six counties. This pilot study concluded\nthat, with a few modifications, the statistical adjustment methodology would provide an\neffective measurement of MAF housing unit coverage. 
The 1998 study confirmed relatively
high MAF error rates and called for significant improvement in the MAF before the decennial.
Although these studies provided important insights into MAF quality, the 1999 study designed
to measure the results of the reengineered MAF-building approach was canceled. Census
officials cited a lack of resources and schedule concerns. We believe, however, that these types
of studies provide an important measurement of MAF accuracy and completeness. Also, the
Population Division’s methodology for comparing numbers of addresses in a county to
independent demographic benchmarks has proved useful in determining areas with potential
coverage problems. Therefore, we are recommending that the bureau continue projects designed
to evaluate MAF housing unit coverage, which should be used throughout the next decade as a
benchmark for basing MAF improvement and maintenance operations.


                                  RECOMMENDATIONS

We recommend that the Director, Bureau of the Census, take the necessary actions to improve
address data quality for Census 2000 and future censuses and surveys, including the following:

1.     Explore causes for continuing overcoverage and undercoverage of housing units
       and use the resulting information as lessons learned when planning future MAF
       improvement operations.

       The Census Bureau concurred with this recommendation.

2.     Develop a MAF-building strategy that ensures sufficient time for MAF
       improvement operations and include as part of the strategy:

       1.     A consistent policy for address eligibility for improvement operations.

       2.     
An approach for verifying addresses without block codes.

       The Census Bureau concurred with this recommendation.

3.     Study causes of duplicate addresses supplied by different sources, such as the Postal
       Service file, local/tribal governments, and block canvassing, and implement
       methods to prevent duplicates.

       The Census Bureau concurred with this recommendation.

4.     Develop an address eligibility policy that defines in advance the criteria to be used
       in selecting addresses during future censuses and surveys.

       The Census Bureau concurred with this recommendation.

5.     Issue a decision memorandum that explains the address eligibility policy for the
       final DMAF delivery of addresses to be included in Census 2000.

       The Census Bureau stated that it agrees that a Decision Memorandum should be issued
       describing the general policy for additions and deletions from the DMAF.

       We believe that the memorandum should also explain the specific criteria for including
       addresses in Census 2000.

6.    Use information in the MAF as a management tool in the future to increase the
      completeness and accuracy of the address file (for example, to identify areas where
      addresses are more likely to be missing or incorrectly geocoded).

      The Census Bureau concurred with this recommendation.

      We believe that the bureau should also consider additional techniques for inclusion in
      its planned work, such as applying address history data to locate geographic areas where
      addresses are likely to be missing and to pinpoint addresses likely to be geocoded in
      error.

7.    
Ensure that any further TIGER changes are verified with the MAF so that no
      additional decennial addresses lose their link to TIGER for Census 2000.

      The Census Bureau concurred with this recommendation and provided a clarification to
      the draft report. Specifically, the bureau noted that for Census 2000, MAF addresses
      were not “lost” from the census if, for any reason, they ceased to match the TIGER
      database; block codes derived from other sources, such as field work, overrode or
      substituted for TIGER block codes according to a documented scheme.

      In response to the bureau’s concerns, we clarified our report where appropriate.

8.    Devise methods to resolve addresses that do not geocode to TIGER.

      The Census Bureau concurred with this recommendation.

9.    Adopt software engineering standards as part of the MAF and TIGER
      modernization.

      The Census Bureau concurred with this recommendation.

10.   Report housing unit coverage standards and results, broken out by their
      overcoverage and undercoverage components:

      a.     As performance measures in the bureau’s input into fiscal year 2002 and
             subsequent Departmental Annual Performance Plans.

      b.     As performance results in the bureau’s input into fiscal year 2002 and
             subsequent Departmental Annual Program Performance Reports.

      The Census Bureau concurred with the recommendation that separate measurements
      should be made and reported for MAF overcoverage and undercoverage. The bureau
      further agreed that there should be a continuous process for monitoring and improving
      MAF quality and coverage. 
However, the bureau stated that it would not be cost effective
      or practical to make annual national or small area MAF coverage measurements because
      doing so would divert an unacceptable level of key staff resources from planning and
      implementing actual MAF/TIGER modernization improvements. The bureau noted that
      planners for the American Community Survey and other intercensal demographic data
      collections have identified no program requirement for annual MAF overcoverage/
      undercoverage measures. According to the bureau, these measures would be extremely
      useful at wider intervals and, funding permitting, it plans to generate them at several
      points in the decade in preparing for the 2010 decennial census.

      We understand the bureau’s concerns about the cost of implementing an annual
      national or small area MAF coverage measurement. However, other methods, such as
      comparing tallies of MAF addresses to demographic estimates as employed by
      Population Division, may serve as an alternative.

11.   
Continue projects designed to evaluate MAF housing unit coverage that can be used
      throughout the next decade as a benchmark for basing MAF improvement and
      maintenance operations.

      The Census Bureau concurred with this recommendation.


                     Appendix A.

           Acronyms Used in This Report


CIFU     Coverage Improvement Followup
DMAF     Decennial Master Address File
DSCMO    Decennial Systems and Contracts Management Office
DSSD     Decennial Statistical Studies Division
GAO      General Accounting Office
ICM      Integrated Coverage Measurement
LUCA     Local Update of Census Addresses
MAF      Master Address File
NRFU     Nonresponse Followup
PMP      Program Management Plan
PRED     Planning, Research, and Evaluation Division
TEA      Type of Enumeration Area
TIGER®   Topologically Integrated Geographic Encoding and Referencing System